programming4us
           
 
 
SQL Server

Setting Up a Full-Text Index (part 2) - Full-Text Indexing of BLOBs and XML

12/18/2010 5:25:46 PM
Full-Text Indexing of BLOBs and XML

SQL Server 2008 can natively index content in columns of the char, nchar, varchar, nvarchar, text, and xml data types. To index binary large objects, you store them in an image or varbinary(max) column and associate with that column a second column (the document type column) that contains the extension the document would have if it were stored in the filesystem. For example, if you store a Word document in the image or varbinary(max) column, the document type column would have the value doc. While indexing the contents of the image or varbinary(max) column, the indexer reads the value of the document type column for that row and launches the IFilter that corresponds to that value. SQL Server 2008 ships with many IFilters. You can tell which document extensions have IFilters by querying sys.fulltext_document_types:

SELECT * FROM sys.fulltext_document_types
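
The pairing of a BLOB column with a document type column can be sketched as follows. This is only an illustration: the table dbo.Documents, its columns, and the constraint name are hypothetical, and the catalog MyCatalog is assumed to exist already.

```sql
-- Hypothetical table: one BLOB column plus the column holding its extension.
CREATE TABLE dbo.Documents
(
    DocumentID   int IDENTITY(1,1) NOT NULL,
    DocExtension nvarchar(8)       NOT NULL,  -- file extension for this row's document
    DocContent   varbinary(max)    NULL,      -- the document itself
    CONSTRAINT PK_Documents PRIMARY KEY (DocumentID)
);
GO

-- TYPE COLUMN tells the indexer which column names the IFilter to launch.
CREATE FULLTEXT INDEX ON dbo.Documents
(
    DocContent TYPE COLUMN DocExtension
)
KEY INDEX PK_Documents ON MyCatalog;
GO
```

It is worth comparing the values you store in the document type column with the document_type values returned by sys.fulltext_document_types, so that every row maps to an installed IFilter.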

If you are indexing a document stored in the image or varbinary(max) data type for which the extension is not listed in sys.fulltext_document_types, the indexer is unable to index the document. To enable indexing for unsupported document types, you must do the following:

1. Download the IFilter for that document type and install it on the server running SQL Server.

2. Enable the third-party IFilters to be used in SQL Server FTS. You do this by issuing the following commands:

EXEC sp_fulltext_service 'load_os_resources', 1
GO
EXEC sp_fulltext_service 'verify_signature', 0
GO
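
After the IFilter is installed and the two settings above are in place, one way to confirm that the extension is recognized is to look it up again in sys.fulltext_document_types. A minimal sketch, assuming a PDF IFilter was installed; the '.pdf' value is only an illustration, and a restart of the SQL Server service may be needed before a newly installed filter shows up:

```sql
-- Check whether an IFilter is now registered for the extension of interest.
-- '.pdf' is an assumed example; substitute the extension you installed.
SELECT document_type, path, manufacturer
FROM sys.fulltext_document_types
WHERE document_type = '.pdf';
```

If the query returns no row, the indexer still has no filter for that extension and documents of that type are not indexed.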

LANGUAGE

By default, the content in the columns you are full-text indexing is broken by the word breakers according to the language rules for the default full-text language setting of your instance of SQL Server. You can view this setting by issuing the following command:

sp_configure 'default full-text language'
go

name                        minimum  maximum     config_value  run_value
--------------------------- -------- ----------- ------------- ---------
default full-text language  0        2147483647  1033          1033


Note the value for run_value. This is the locale identifier (LCID). To determine which language the LCID corresponds to, you issue the following:
SELECT name FROM sys.fulltext_languages WHERE lcid=1033
go

name
----------------------------------------------------------
English

In this example, 1033 is the value returned for run_value in the sp_configure query. Note that querying sys.fulltext_languages without the WHERE clause returns a list of the language word breakers that ship by default with SQL Server 2008.
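
To see every available language rather than a single LCID, the same catalog view can be queried without a filter. A short sketch:

```sql
-- List every full-text language (word breaker) registered on this instance.
SELECT lcid, name
FROM sys.fulltext_languages
ORDER BY name;
```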

The preceding execution of sp_configure returned the default full-text value of 1033, which corresponds to English. Microsoft recognizes two types of English in all Microsoft search products: English (U.S. English) and British English (International English). There are very slight differences between the two word breakers, mainly due to differing suffixes and spellings (for example, British English recognizes connexion and colour as legitimate spellings).

By default, all columns are full-text indexed by the word breaker that corresponds to your default full-text language settings for your instance of SQL Server.
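
If the instance default does not suit most of your content, it can be changed with sp_configure. The following is a sketch under the assumption that you want British English (LCID 2057) as the new default; 'default full-text language' is an advanced option, so advanced options have to be made visible first:

```sql
-- Make advanced options visible to sp_configure.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
GO

-- Assumed target: 2057 (British English). Pick the LCID you need from
-- sys.fulltext_languages.
EXEC sp_configure 'default full-text language', 2057;
RECONFIGURE;
GO
```

The new default applies to full-text indexes created afterward without an explicit LANGUAGE clause; columns that are already indexed keep the language they were created with.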

SQL Server FTS allows you to use the language tag to specify word breakers for different languages to be used to full-text index columns. For example, if you are storing Traditional Chinese content in a column you want to index, and you want it to be indexed using Traditional Chinese, you could issue the following statement to create a full-text index:

CREATE FULLTEXT INDEX ON Person.Contact(FirstName,
LastName LANGUAGE 1028)
KEY INDEX PK_Contact_ContactID ON MyCatalog

This example full-text indexes two columns; one called FirstName is indexed using the server default full-text language, and the other, called LastName, is indexed using the Traditional Chinese language word breaker. This means that what ends up stored in the full-text indexes is broken according to the language rules of the word breaker. For U.S. and International English, the words are primarily broken at whitespace or word boundaries (that is, punctuation marks). For other languages, the word may be broken into constituent words or alternate words. For example, if you use the German word breaker, wanderlust is broken as wanderlust, wandern, and lust, and all three words are stored in the index; searches on wanderlust, wandern, and lust all return hits to rows containing wanderlust.
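
One way to inspect how a given word breaker splits a term is the sys.dm_fts_parser function introduced in SQL Server 2008. A minimal sketch, assuming the German word breaker (LCID 1031) is installed and that you are a member of the sysadmin role, which the function requires:

```sql
-- Arguments: query string, LCID, stoplist id (0 = system stoplist),
-- accent sensitivity (0 = insensitive).
SELECT display_term, special_term, occurrence
FROM sys.dm_fts_parser(N'"wanderlust"', 1031, 0, 0);
```

Running the same query with 1033 in place of 1031 lets you compare the English and German word breakers side by side.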

You can specify different language settings for each column you are full-text indexing, but you can assign only one language setting for each column.

If you are storing BLOBs in columns of the image or varbinary(max) data type and have a document type column assigned to these columns, the language settings within the content itself may, depending on the content, override both the language you specified for the full-text index and the SQL Server default full-text language. For example, suppose you index HTML or Word documents that are marked as Chinese, you have specified that the column be indexed in German, and your SQL Server default full-text language is French: the content is indexed as Chinese. The same holds true for XML documents stored in columns of the xml data type: the xml:lang setting determines the language in which these documents are indexed.

ON FULLTEXT CATALOG

The ON FULLTEXT CATALOG parameter allows you to place your full-text index in a specific catalog. If you have a default full-text catalog for the database, you do not need to specify a catalog. You get better indexing and querying performance if you place larger tables in their own full-text catalogs.
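
As a sketch of both cases, the statements below create a default catalog and a dedicated catalog, then place an index in the dedicated one. The catalog names GeneralCatalog and ContactCatalog are hypothetical and chosen only for illustration:

```sql
USE AdventureWorks;
GO

-- Hypothetical catalog names.
CREATE FULLTEXT CATALOG GeneralCatalog AS DEFAULT;  -- used when no catalog is named
GO
CREATE FULLTEXT CATALOG ContactCatalog;             -- dedicated to one large table
GO

-- The ON clause places the index in the named catalog; omitting it
-- would place the index in the database default catalog.
CREATE FULLTEXT INDEX ON Person.Contact(FirstName, LastName)
KEY INDEX PK_Contact_ContactID ON ContactCatalog;
GO
```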

KEY INDEX

SQL Server FTS must be able to identify the row that it is indexing or that is returned in the query results. You specify which column is to be used as the key by using the KEY INDEX parameter in your full-text index creation statement. As mentioned previously, this column must be unique and non-nullable, and it must have a single-column index that is not offline and have a maximum size of 900 bytes. It can be a unique index or your primary key.
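
To find indexes on a table that meet these requirements, you can query the catalog views. The following is a rough sketch rather than an authoritative check; it looks for enabled, unique, single-column indexes on Person.Contact whose key column is non-nullable and at most 900 bytes wide:

```sql
USE AdventureWorks;
GO

SELECT  i.name            AS index_name,
        MIN(c.name)       AS key_column,
        MIN(c.max_length) AS max_length_bytes
FROM    sys.indexes AS i
JOIN    sys.index_columns AS ic
          ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN    sys.columns AS c
          ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE   i.object_id = OBJECT_ID(N'Person.Contact')
  AND   i.is_unique = 1
  AND   i.is_disabled = 0
  AND   ic.is_included_column = 0                 -- count key columns only
GROUP BY i.name
HAVING  COUNT(*) = 1                              -- single-column index
   AND  MAX(CAST(c.is_nullable AS int)) = 0       -- key column is NOT NULL
   AND  MIN(c.max_length) BETWEEN 1 AND 900;      -- excludes (max) types, which report -1
```

Any index name returned is a candidate for the KEY INDEX clause; a narrow integer key is generally the cheaper choice.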

POPULATION TYPE

The process in which the indexer extracts your table content and builds a full-text index is called population. There are three types of populations:

  • Full

  • Incremental

  • Change tracking

No matter what population type you choose, a full population is done first. The full population extracts rows in batches and indexes them. It does not do any change tracking, so your catalog starts to become out-of-date as soon as the population completes.

An incremental population occurs if there is a timestamp column on the table you are full-text indexing. The incremental population extracts each row to determine which rows have been updated and re-indexes only the changed rows. It also determines which rows have been removed from the table you are full-text indexing. A row is flagged to be re-indexed if any of its columns are updated, so if you update a column that is not being full-text indexed, the row is still indexed again.
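
If the table has no timestamp column yet, one can be added before you rely on incremental populations. A minimal sketch, assuming a hypothetical table dbo.Documents and a hypothetical column name RowVer (a table can have only one such column, so check the existing columns first):

```sql
-- rowversion is the synonym for the timestamp data type;
-- the table and column names here are illustrative only.
ALTER TABLE dbo.Documents
    ADD RowVer rowversion;
```

SQL Server maintains the column automatically on every insert and update, which is what allows the incremental population to detect changed rows.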

Use incremental populations rather than full populations when a significant amount of your table's contents changes at any one time. If the bulk of your table changes (around 90%), however, a full population is faster than an incremental population.

The following statement creates the full-text index with change tracking off and without an initial population, so that you can start full or incremental populations yourself:

Use AdventureWorks;
CREATE FULLTEXT INDEX ON Person.Contact(Firstname)
KEY INDEX pk_Contact_ContactID WITH CHANGE_TRACKING OFF, NO POPULATION

To then start a full or an incremental population, you issue the following statements, respectively:

Use AdventureWorks;
ALTER FULLTEXT INDEX ON Person.Contact START FULL POPULATION -- full population

Use AdventureWorks;
ALTER FULLTEXT INDEX ON Person.Contact START INCREMENTAL POPULATION -- incremental population
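
Populations run asynchronously, so it can help to check on their progress. A short sketch using the OBJECTPROPERTYEX function against the same table:

```sql
USE AdventureWorks;
GO

-- populate_status: 0 = idle, 1 = full population in progress,
-- 2 = incremental population in progress, 3 = propagation of tracked changes.
SELECT OBJECTPROPERTYEX(OBJECT_ID(N'Person.Contact'),
                        'TableFulltextPopulateStatus') AS populate_status,
       OBJECTPROPERTYEX(OBJECT_ID(N'Person.Contact'),
                        'TableFulltextDocsProcessed')  AS docs_processed;
```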



At all other times, you should use change tracking because it is much more efficient and offers near-real-time indexing. Change tracking indexes, in near-real-time, the rows in which the columns you are full-text indexing have been modified. Change tracking starts by doing a full population but does an incremental population if a timestamp column exists on the table. Change tracking (like other population types) causes some locking on the tables you are full-text indexing, so you have an option to schedule when the indexing of the modified rows is done.

By default, when you create a new full-text index, change tracking is enabled. In other words, a full population is done and when it completes, all rows modified during the full population and after it completes are indexed. So the following statements are equivalent:

Use AdventureWorks;
CREATE FULLTEXT INDEX ON Person.Contact(Firstname)
KEY INDEX pk_Contact_ContactID WITH CHANGE_TRACKING AUTO

Use AdventureWorks;
CREATE FULLTEXT INDEX ON Person.Contact(Firstname)
KEY INDEX pk_Contact_ContactID

Because change tracking causes some locking, you can schedule rows to be tracked in real-time but indexed only at scheduled intervals by using the following statement:

Use AdventureWorks;
CREATE FULLTEXT INDEX ON Person.Contact(Firstname)
KEY INDEX pk_Contact_ContactID WITH CHANGE_TRACKING MANUAL

The preceding command assumes that the database has a default full-text catalog. If you do not have a default catalog, you would have to specify a named one like this:

Use AdventureWorks;
CREATE FULLTEXT INDEX ON Person.Contact(Firstname)
KEY INDEX pk_Contact_ContactID ON DEFAULT_FULLTEXT_CATALOG WITH
CHANGE_TRACKING MANUAL

To update your index, you issue the following:

Use AdventureWorks;
ALTER FULLTEXT INDEX ON Person.Contact START UPDATE POPULATION
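
With manual change tracking, a scheduled job might start an update population only when there is something to index. The following is a sketch of that idea, not a prescribed approach; it reads the TableFulltextPendingChanges property and assumes the index on Person.Contact uses CHANGE_TRACKING MANUAL:

```sql
USE AdventureWorks;
GO

-- Number of tracked changes that have not yet been indexed.
DECLARE @pending int;
SET @pending = CAST(OBJECTPROPERTYEX(OBJECT_ID(N'Person.Contact'),
                                     'TableFulltextPendingChanges') AS int);

-- Only start the update population when changes are waiting.
IF @pending > 0
    ALTER FULLTEXT INDEX ON Person.Contact START UPDATE POPULATION;
```

A batch like this could be run from a SQL Server Agent job during a quiet period to keep the locking away from peak hours.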

ALTER FULLTEXT INDEX

As you have seen in this article, you can use the ALTER FULLTEXT INDEX command to manage populations. You can also use it for a wide variety of index maintenance tasks. Its parameters (ENABLE and DISABLE, SET CHANGE_TRACKING, ADD, DROP, and START and STOP) are discussed in the following sections.

ENABLE and DISABLE

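The ENABLE and DISABLE parameters enable and disable full-text indexing on a table. While a full-text index is disabled, its metadata stays in place, but changes to the table are no longer tracked or propagated to the index, so the catalog is not kept up-to-date. Microsoft's documentation for ALTER FULLTEXT INDEX also states that the table does not support full-text queries while the index is disabled, so verify the behavior on your version before relying on searches against a disabled index.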

For example, you could disable indexing with the following command:

Use AdventureWorks;
ALTER FULLTEXT INDEX ON Person.Contact DISABLE

And then you could re-enable indexing with the following:

Use AdventureWorks;
ALTER FULLTEXT INDEX ON Person.Contact ENABLE

When you re-enable a full-text index, change tracking commences to update changes that occurred while full-text indexing was disabled. If you disabled change tracking prior to disabling the full-text index, you have to run a full or incremental population to get your catalog up-to-date.
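
To check which state an index is currently in, you can query the sys.fulltext_indexes catalog view. A short sketch against the same table:

```sql
USE AdventureWorks;
GO

-- is_enabled: 1 when the full-text index is enabled.
-- change_tracking_state_desc: AUTO, MANUAL, or OFF.
SELECT is_enabled,
       change_tracking_state_desc,
       has_crawl_completed
FROM sys.fulltext_indexes
WHERE object_id = OBJECT_ID(N'Person.Contact');
```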

SET CHANGE_TRACKING

The SET CHANGE_TRACKING option allows you to control change tracking. For example, you can turn it off, turn it on, or schedule it. Because change tracking does cause some locking, you might want to schedule it during a quiet time when the database is not under load to minimize the impact of the locking.

Here is an example of the use of SET CHANGE_TRACKING:

Use AdventureWorks;
ALTER FULLTEXT INDEX ON Person.Contact SET CHANGE_TRACKING AUTO

The options for setting change tracking are as follows:

  • AUTO— Enables continuous real-time indexing.

  • OFF— Disables change tracking.

  • MANUAL— Provides continuous change tracking, but rows are indexed only when you issue the following command:

Use AdventureWorks;
ALTER FULLTEXT INDEX ON Person.Contact Start Update Population

ADD

You use the ADD parameter to add a new column to a full-text index. For example, consider Person.Contact, a table in the AdventureWorks database, with three char columns on it: Firstname, Lastname, and EmailAddress. You have already created a full-text index on Firstname and Lastname. You could add full-text indexing to EmailAddress by issuing the following command:

Use AdventureWorks;
ALTER FULLTEXT INDEX ON Person.Contact ADD(EmailAddress)

As soon as you add the new column, a full population is done to index the contents of the newly added column. You can disable it with the WITH NO POPULATION clause, as in this example:

Use AdventureWorks;
ALTER FULLTEXT INDEX ON Person.Contact ADD(EmailAddress) WITH NO POPULATION

You may get the following message:

Msg 7663, Level 16, State 2, Line 2
Option 'WITH NO POPULATION' should not be used when change tracking is enabled.


This message indicates that change tracking is on. To prevent a population from starting immediately after you add the column, you would first have to disable change tracking and then make your change, as illustrated in the following example:

ALTER FULLTEXT INDEX ON Person.Contact
SET CHANGE_TRACKING OFF
ALTER FULLTEXT INDEX ON Person.Contact ADD(EmailAddress)
WITH NO POPULATION

You also have the option to specify a particular word breaker to be used, or a document type column to reference if the column you add is an image or varbinary(max) column.
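
Both options can be sketched in one statement. The example assumes a hypothetical table dbo.Documents that already has a full-text index, a varbinary(max) column DocContent, a document type column DocExtension, and a text column Summary; none of these names come from AdventureWorks:

```sql
-- Hypothetical table and column names, for illustration only.
ALTER FULLTEXT INDEX ON dbo.Documents
ADD
(
    DocContent TYPE COLUMN DocExtension LANGUAGE 1033,  -- BLOB with its type column
    Summary LANGUAGE 1028                               -- Traditional Chinese word breaker
);
```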

DROP

The DROP parameter is the counterpart of ADD: it allows you to drop a column from the full-text index. Like ADD, it supports the WITH NO POPULATION clause, which disables automatic re-indexing after you drop the full-text column. Here is an example of its use:

Use AdventureWorks;
ALTER FULLTEXT INDEX ON Person.Contact DROP (Firstname) WITH NO POPULATION

Again, you may get the following message:

Msg 7663, Level 16, State 2, Line 2
Option 'WITH NO POPULATION' should not be used when change tracking is enabled.


This message indicates that change tracking is on. To prevent a population from starting immediately after you drop the column, you would first have to disable change tracking and then make your change, as illustrated in the following example:

ALTER FULLTEXT INDEX ON Person.Contact
SET CHANGE_TRACKING OFF
ALTER FULLTEXT INDEX ON Person.Contact DROP(EmailAddress)
WITH NO POPULATION

The DROP command can be used to drop all the full-text columns on a table.
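
Before adding or dropping columns, it can be useful to see which columns are currently in the index. A sketch using the sys.fulltext_index_columns catalog view:

```sql
USE AdventureWorks;
GO

-- One row per full-text indexed column, with its type column (if any)
-- and the LCID of the word breaker used for it.
SELECT  c.name          AS indexed_column,
        tc.name         AS type_column,
        fic.language_id AS lcid
FROM    sys.fulltext_index_columns AS fic
JOIN    sys.columns AS c
          ON c.object_id = fic.object_id AND c.column_id = fic.column_id
LEFT JOIN sys.columns AS tc
          ON tc.object_id = fic.object_id AND tc.column_id = fic.type_column_id
WHERE   fic.object_id = OBJECT_ID(N'Person.Contact');
```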

START and STOP

The START and STOP parameters can be used to start and stop full, incremental, or update populations. Following is the typical syntax:

Use AdventureWorks;
ALTER FULLTEXT INDEX ON Person.Contact Stop Population

Use AdventureWorks;
ALTER FULLTEXT INDEX ON Person.Contact Start Full Population

The update population is used in conjunction with change tracking. For example, if you set up change tracking in manual mode, you start an update population to index the tracked changes, like this:

Use AdventureWorks;
ALTER FULLTEXT INDEX ON Person.Contact SET CHANGE_TRACKING Manual

Use AdventureWorks;
ALTER FULLTEXT INDEX ON Person.Contact START Update Population

We’ve completed our look at the catalog and index creation statements. Next, we look at how to manage full-text catalogs and indexes.

Managing MSFTESQL

After you create full-text catalogs and indexes, you might need to manage the full-text engine. The command used to do this is sp_fulltext_service, which accepts the following parameters:

  • @action

  • @value

Following are the acceptable values for the @action parameter:

  • load_os_resources— Controls whether the full-text engine loads word breakers and IFilters that are not part of SQL Server but are installed in the OS. A value of 1 loads the OS word breakers and IFilters.

  • pause_indexing— Pauses the indexing process. During this pause, you can still query the full-text catalogs.

  • resource_usage— Is used for backward compatibility.

  • update_languages— Updates the language cache with recently installed word breakers.

  • verify_signature— Disables the checking of signatures for word breakers and IFilters when set to 0. When set to the default, 1, signatures are checked.

  • upgrade_option— Controls how SQL Server processes catalogs in a database that are restored or attached to SQL Server 2008. It accepts three values: 0, which forces attached or restored databases with full-text catalogs to be rebuilt; 1, which means the full-text catalogs’ metadata remains, but the catalog contents are deleted (these catalogs are queryable, but no results are returned until you rebuild them); 2, which means the full-text indexes are imported into the database (however, the results may be inconsistent because some of the full-text indexes are generated by the SQL 2005 full-text word breakers and not the SQL Server 2008 word breakers).
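
A few of these actions can be sketched as calls. The values shown are examples only, and the pause_indexing values assume that 1 pauses indexing and 0 resumes it:

```sql
-- Refresh the language cache after installing new word breakers.
EXEC sp_fulltext_service @action = 'update_languages';
GO

-- Pause indexing during a busy period, then resume it
-- (assumed values: 1 = pause, 0 = resume).
EXEC sp_fulltext_service @action = 'pause_indexing', @value = 1;
GO
EXEC sp_fulltext_service @action = 'pause_indexing', @value = 0;
GO

-- Import full-text indexes as-is when databases are attached or restored.
EXEC sp_fulltext_service @action = 'upgrade_option', @value = 2;
GO
```

These are server-wide settings, so a change made for one database affects every database on the instance.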

Now you know how to build full-text catalogs and indexes and modify them. The next section describes how to get information on the catalogs and indexes you build.
